Symbolic Principal Components for Interval-valued Observations

Authors

  • L. Billard
  • A. Douzal-Chouakria
  • E. Diday
Abstract

One feature of contemporary datasets is that, instead of the single point value in the p-dimensional space ℝ^p seen in classical data, the data may take interval values, thus producing hypercubes in ℝ^p. This paper extends the methodology of classical principal components to interval-valued data. Two methods are proposed: a vertices method, which uses all the vertices of each observation's hypercube, and a centers method, which uses the centroid values. Unlike classical data, each symbolic data point has internal variation. For both the vertices and centers methods, we obtain interval-valued symbolic principal components that recapture the internal variation of the observations, as well as diagnostics such as correlation measures between these principal components and the random variables and/or the observations themselves. We also provide a visualization method that further aids in the interpretation of the methodology. The methods are illustrated on a dataset of measurements of facial characteristics obtained from a study of face recognition patterns for surveillance purposes, and on a dataset of species of bats where the measurements are naturally interval-valued. A comparison with analyses in which classical surrogates replace the intervals shows how the symbolic analyses give more informative conclusions.
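The centers idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `centers_pca` and the arguments `lower` and `upper` are hypothetical, and the sketch assumes an unstandardized covariance matrix of the interval midpoints and at least two variables. It runs a classical PCA on the midpoints, then projects each observation's hypercube onto every component to obtain an interval-valued score.

```python
import numpy as np

def centers_pca(lower, upper):
    """Illustrative centers-style PCA for interval data (assumed setup).

    lower, upper: (n, p) arrays of interval endpoints with lower <= upper, p >= 2.
    Returns eigenvalues, eigenvectors (columns), and interval scores lo, hi.
    """
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)

    # Classical PCA on the interval midpoints (centroids of the hypercubes).
    centers = (lower + upper) / 2.0
    mean = centers.mean(axis=0)
    cov = np.cov(centers - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)

    # eigh returns ascending eigenvalues; reorder to descending.
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # A linear projection attains its extremes at hypercube vertices:
    # positive loadings take the lower endpoint for the minimum and the
    # upper endpoint for the maximum; negative loadings do the reverse.
    pos = np.clip(eigvecs, 0.0, None)
    neg = np.clip(eigvecs, None, 0.0)
    lo = (lower - mean) @ pos + (upper - mean) @ neg
    hi = (upper - mean) @ pos + (lower - mean) @ neg
    return eigvals, eigvecs, lo, hi
```

For observation u and component k, `[lo[u, k], hi[u, k]]` is the range of scores over that observation's hypercube, so a degenerate interval (lower equal to upper) collapses to the classical point score.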


Similar resources

Extracting Information from Interval Data Using Symbolic Principal Component Analysis

We address the definition of symbolic variance and covariance for random interval-valued variables, and present four known symbolic principal component estimation methods using a common insightful framework. In addition, we provide a simple explicit formula for the scores of the symbolic principal components, equivalent to the representation by Maximum Covering Area Rectangle. Furthermore, the ...


Principal Curves and Surfaces to Interval Valued Variables

In this paper we propose a generalization to symbolic interval-valued variables of the Principal Curves and Surfaces method proposed by T. Hastie in [4]. Given a data set X with n observations and m continuous variables, the main idea of the Principal Curves and Surfaces method is to generalize the principal component line, providing a smooth one-dimensional curved approximation to a set of data poin...


Principal component analysis for interval-valued observations

One feature of contemporary datasets is that, instead of the single point value in the p-dimensional space ℝ^p seen in classical data, the data may take interval values, thus producing hypercubes in ℝ^p. This paper studies the vertices principal components methodology for interval-valued data; and provides enhancements to allow for so-called ‘trivial’ intervals, and generalized weight functions. It ...


Symbolic Covariance Matrix for Interval-valued Variables and its Application to Principal Component Analysis: a Case Study

In the last two decades, principal component analysis (PCA) was extended to interval-valued data; several adaptations of the classical approach are known from the literature. Our approach is based on the symbolic covariance matrix Cov for the interval-valued variables proposed by Billard (2008). Its crucial advantage, when compared to other approaches, is that it fully utilizes all the informat...


Acquiring Non Linear Subspace for Face Recognition using Symbolic Kernel PCA Method

In this paper, a new technique called symbolic kernel Principal Component Analysis (KPCA) is explored to develop a model for face representation and recognition. The conventional kernel PCA method extracts single valued features from the original image space to represent face images. The proposed method reduces the dimensionality of original image space by representing the face images as symbol...




Publication date: 2009